A modification of the Lloyd algorithm for k-anonymous quantization

Authors

  • David Rebollo-Monedero
  • Jordi Forné
  • Esteve Pallarès
  • Javier Parra-Arnau
Abstract

We address the problem of designing quantizers that cluster data while satisfying a k-anonymity requirement. A general data compression perspective is adopted, which considers both discrete and continuous probability distributions, and corresponding constraints on both cell sizes and quantizer index probabilities. Potential applications of this problem extend well beyond the important case of microdata anonymization, to include also optimized task allocation under workload constraints. Our contribution is twofold. First and most importantly, we present a theoretical analysis showing the optimality conditions which probability-constrained quantizers must satisfy, thereby theoretically characterizing optimal k-anonymous aggregation as a special case. As a second contribution, inspired by our theoretical analysis, we propose an alternating optimization algorithm for the design of this type of quantizer. Our algorithm is conceptually motivated by the popular Lloyd–Max algorithm for quantization design, originally intended for data compression and also known as the k-means method. Experimental results for synthetic and real data, with mean squared error as the distortion measure, confirm that our method outperforms MDAV, a popular fixed-size microaggregation algorithm for statistical disclosure control. This performance improvement is in terms of data utility, for the exact same k-anonymity constraint, but does come at the expense of higher computational sophistication. © 2012 Elsevier Inc. All rights reserved.
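For orientation, the following is a minimal sketch of the conventional, unconstrained Lloyd (k-means) iteration that the abstract names as its conceptual starting point, with mean squared error as the distortion measure. It is not the authors' modified algorithm, whose probability-constrained optimality conditions are not given in the abstract. The function names `lloyd` and `satisfies_k_anonymity` and the sample data are assumptions made purely for illustration; the second function only checks whether every cell holds at least k points and does not enforce that constraint.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_error(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: Sequence[Point]) -> Point:
    """Component-wise mean of a non-empty collection of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def lloyd(points: Sequence[Point],
          initial_centroids: Sequence[Point],
          iterations: int = 20) -> Tuple[List[Point], List[int]]:
    """Plain Lloyd iteration (hypothetical helper, no anonymity constraint)."""
    centroids = list(initial_centroids)
    assignment: List[int] = []
    for _ in range(iterations):
        # Nearest-neighbor step: map each point to its closest centroid.
        assignment = [
            min(range(len(centroids)),
                key=lambda j: squared_error(p, centroids[j]))
            for p in points
        ]
        # Centroid step: move each centroid to the mean of its cell.
        for j in range(len(centroids)):
            cell = [p for p, a in zip(points, assignment) if a == j]
            if cell:  # leave the centroid of an empty cell unchanged
                centroids[j] = centroid(cell)
    return centroids, assignment


def satisfies_k_anonymity(assignment: Sequence[int],
                          n_cells: int, k: int) -> bool:
    """Check (not enforce) that every cell contains at least k points."""
    return all(list(assignment).count(j) >= k for j in range(n_cells))


# Illustrative one-dimensional data, chosen for this sketch only.
data = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
centers, cells = lloyd(data, [(0.0,), (12.0,)])
```

Because plain Lloyd places no lower bound on cell sizes, the check can fail on less regular data; handling that constraint inside the optimization is the problem the paper addresses.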

Similar resources

An algorithm for k-anonymous microaggregation and clustering inspired by the design of distortion-optimized quantizers

Article history: received 27 April 2010; received in revised form 21 June 2011; accepted 21 June 2011; available online 2 July 2011. We present a multidisciplinary solution to the problems of anonymous microaggregation and clustering, illustrated with two applications, namely privacy protection in databases, and private retrieval of location-based information. Our solution is perturbative, is based...

Detection of perturbed quantization (PQ) steganography based on empirical matrix

The Perturbed Quantization (PQ) steganography scheme is almost undetectable with current steganalysis methods. We present a new steganalysis method for detection of this data hiding algorithm. We show that the PQ method distorts the dependencies of DCT coefficient values; especially changes much lower than significant bit planes. For steganalysis of PQ, we propose feature extraction from the e...

NGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self-Organizing Map

Identifying clusters is an important aspect of data analysis. This paper proposes a novel data clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizing map (NGTSOM) and neural gas (NG) are used in combination with Competitive Hebbian Learning (CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clustering data. Different ...

An efficient algorithm for finding the semi-obnoxious $(k,l)$-core of a tree

In this paper we study the problem of finding the $(k,l)$-core of a tree whose vertices have positive or negative weights. Let $T=(V,E)$ be a tree. The $(k,l)$-core of $T$ is a subtree with at most $k$ leaves and a diameter of at most $l$ that minimizes the sum of the weighted distances from all vertices to this subtree. We show that, when the sum of the weights of vertices is negative, t...

An Effective Method for Initialization of Lloyd-Max's Algorithm of Optimal Scalar Quantization for Laplacian Source

In this paper an exact and complete analysis of the Lloyd–Max algorithm and its initialization is carried out. An effective method for initializing the Lloyd–Max algorithm of optimal scalar quantization for a Laplacian source is proposed. The proposed method is a very simple way of making an intelligent guess of the starting points for the iterative Lloyd–Max algorithm. Namely, the initia...

Journal:
  • Inf. Sci.

Volume 222

Publication date: 2013